Exploring wind data using the Meteostat Python API

Group Members: Travis, Ira, Micah

Course: Data Science

Goal: Use ML models to predict wind trends in distinct U.S. regions


Source: Meteostat Python API

Dataset Type: Aggregated weather observations per station

Key Variables
  • wspd: Average wind speed (mph)
  • wdir: Mean wind direction (degrees)

Time Period: 2024

Models: Wind Speed, Wind Direction, Locations

Frame: Hourly and Daily


Using Pittsburgh station

Missing Values
temp       0
dwpt       0
rhum       0
prcp    1091
snow    8761
wdir       0
wspd       0
wpgt    8761
pres       0
tsun    8761
coco       6
dtype: int64

Lag features: values from 1, 3, and 6 hours before
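
A minimal sketch of how lag features like these could be built with pandas. The helper name add_lag_features, the choice of wspd as the lagged column, and the synthetic series are assumptions for illustration, not the project's actual code. The shift is by row, so it assumes one row per consecutive hour with no gaps.

```python
import pandas as pd


def add_lag_features(df, column="wspd", lags=(1, 3, 6)):
    """Return a copy of df with one shifted copy of `column` per lag (in rows)."""
    out = df.copy()
    lag_columns = []
    for lag in lags:
        name = f"{column}_lag{lag}"
        out[name] = out[column].shift(lag)  # value observed `lag` rows earlier
        lag_columns.append(name)
    # The first max(lags) rows have no complete history, so drop them.
    return out.dropna(subset=lag_columns)


# Synthetic hourly series standing in for the station data (assumption).
index = pd.date_range("2024-01-01", periods=10, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"wspd": [float(v) for v in range(10)]}, index=index)
lagged = add_lag_features(hourly)
print(lagged)
```

With lags of 1, 3, and 6, the first six rows are dropped because the 6-hour lag is undefined for them.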

TimeSeriesSplit with n_splits = 5
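
As an illustration of what this split does, the sketch below runs scikit-learn's TimeSeriesSplit on a small synthetic array (24 rows, an assumption chosen only to keep the output readable). Each fold trains on an expanding window of earlier rows and tests on the rows that follow, so no future observation leaks into training.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 24 synthetic rows in chronological order (stand-in for hourly data).
X = np.arange(24).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=5)
folds = list(tscv.split(X))

for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(
        f"fold {fold_number}: "
        f"train rows {train_idx[0]}-{train_idx[-1]}, "
        f"test rows {test_idx[0]}-{test_idx[-1]}"
    )
```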

Linear Regression
MAE:  3.223
RMSE: 4.351
R²:   0.675

HistGradientBoostingRegressor
MAE:  2.649
RMSE: 3.906
R²:   0.743
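
For reference, the three metrics reported for both models can be computed as in the sketch below. It uses the standard definitions with plain Python; the function name regression_metrics and the toy numbers are illustrative assumptions, and the project itself may have used scikit-learn's metric functions instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R²) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Toy values, not project data.
mae, rmse, r2 = regression_metrics([3, 5, 7, 9], [2, 5, 8, 9])
print(f"MAE: {mae:.3f}  RMSE: {rmse:.3f}  R²: {r2:.3f}")
```

MAE and RMSE are in the units of the target, while R² is unitless, which is why R² is the easier number to compare across targets.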

5 cities examined

Chicago, Denver, Miami, Phoenix, Seattle

Target Variable

Wind direction (degrees)

Training features
  • Numerical: temp, dwpt, rhum, pres, wspd, latitude, longitude, elevation
  • Categorical: City
  • Modified: hour_sin(N), hour_cos(N), month_sin(N), month_cos(N), condition_group(C)
  • Excluded: tsun, snow, wpgt, prcp
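
The hour_sin/hour_cos and month_sin/month_cos features in the list above are presumably a cyclical encoding, which places hour 23 next to hour 0 instead of 23 units away. A minimal sketch follows; the helper names are hypothetical, and shifting the month by one so that January maps to angle 0 is an assumption, not something the slides state.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encoded_distance(a, b, period):
    """Euclidean distance between two encoded values."""
    return math.dist(cyclical_encode(a, period), cyclical_encode(b, period))


hour_sin, hour_cos = cyclical_encode(6, 24)        # 06:00
month_sin, month_cos = cyclical_encode(7 - 1, 12)  # July, with January at angle 0

# 23:00 -> 00:00 is the same step size as 00:00 -> 01:00 after encoding.
print(encoded_distance(23, 0, 24), encoded_distance(0, 1, 24))
```
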
Model Procedures
  • Hourly data sample from 2020-2024 on all cities
  • 80% train/20% test random split
  • StandardScaler() used on Numerical values
  • OneHotEncoder() used on Categorical values
Results
  • Linear Regression Model: R² = 0.329
  • HistGradientBoosting Model: R² = 0.631

*See visuals for further model comparisons


The input features are the wind speed and direction at the following weather stations:

  • Allegheny County Airport
  • Butler / Brownsdale
  • Washington / Lagonda
  • Zelienople / Old Furnace
  • Beaver Falls / West Mayfield

The values we are trying to predict are the wind speed and direction at Greater Pittsburgh International Airport.

The model achieved the following metrics on the test set:

  • R²: 0.620057846963885
  • RMSE: 54.6469878681952

See the last code cell for a visualization of the predicted vs. actual wind speed.

Potential improvements:

  • Convert the wind vector from angle and speed to its north-south and east-west components.
  • Switch to a more complex model.
  • Predict more locations (given their closest stations).
  • Add more features.
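
The first improvement in the list above can be sketched as below. Predicting an angle directly treats 359° and 1° as far apart, whereas the two components vary smoothly. The sketch assumes the usual meteorological convention that wdir is the direction the wind blows from, measured clockwise from north; the function names are hypothetical.

```python
import math


def wind_to_components(speed, direction_deg):
    """Convert speed and 'from' direction to (u, v).

    u is the east-west component (positive toward the east) and
    v is the north-south component (positive toward the north).
    """
    theta = math.radians(direction_deg)
    u = -speed * math.sin(theta)
    v = -speed * math.cos(theta)
    return u, v


def components_to_wind(u, v):
    """Inverse of wind_to_components: return (speed, direction in [0, 360))."""
    speed = math.hypot(u, v)
    direction_deg = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction_deg


# A 10-unit wind from the north blows toward the south, so v is negative.
u_north, v_north = wind_to_components(10.0, 0.0)
speed_back, direction_back = components_to_wind(*wind_to_components(5.0, 270.0))
print(u_north, v_north, speed_back, direction_back)
```

A model would then predict u and v as two regression targets and convert the predictions back to speed and direction for reporting.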


  1. How do wind patterns change by region?
  2. What are some case studies of extreme weather?
  3. How do geographical features (lakes, oceans, mountains, deserts, plains) impact wind patterns?